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, Abstract: False discovery rate (FDR) has been widely used as an error mea- 

&J[)i sure in large scale multiple testing problems, but most research in the area has 

^ ' been focused on procedures for controlling the FDR based on independent test 

' statistics or the properties of such procedures for test statistics with certain 

^ ' types of stochastic dependence. Based on an approach proposed in Tang and 

, Zhang (2005), we further develop in this paper empirical Bayes methods for 

controlling the FDR with dependent data. We implement our methodology in 
a time series model and report the results of a simulation study to demonstrate 
' the advantages of the empirical Bayes approach. 

1. Introduction 

B 

C/3 ', The false discovery rate (FDR), proposed by Benjamini and Hoclrberg {[l], BH 

hereafter), has been widely used as an error measure in multiple testing problems. 
Let R be the number of rejected hypotheses (discovered items) and V be the number 
^ ' of falsely rejected hypotheses, the FDR is defined as 

(1.1) FT)R = E{V/R)l{R>o}- 



o 



X 



Since most discovered items truly contain signals when the ratio V/ R is small, FDR 
controlling methods allow a scientific inquiry to move ahead from a screening ex- 
QP I periment to more focused systematic investigations of discovered items. Moreover, 

since FDR controlling methods do allow a small number of items without signal to 
I slip through, they often provide sufficiently many discovered items in such screening 

experiments. Thus, the FDR seems to strike a suitable balance between the needs 
of multiple error control and sufficient discovery power in many applications (e.g. 
microarray, imaging, and astrophysics studies), compared with the more conserva- 
H I tive family-wise error rate (FWER) P{V > 0) and more liberal per-comparison 

error rate (PCER) EV/m, where m is the total number of hypotheses being tested. 
Mathematically, we notice that 

PCER< FDR <FWER. 

The FDR is closely related to the positive predictive value (PPV) for diagnostic 
tests, since 

(1.2) PPV=l-l^/i? 
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based on ground truth. 

Most papers in the FDR hterature have focused on procedures for controlhng 
the FDR based on independent test statistics or properties of such procedures 
with dependent test statistics. For example, in BH 1], the p- values for individual 
hypotheses are assumed to be independent and uniformly distributed under the 
null. Benjamini and Yekutieli 3] proved that the BH [1] procedure still controls the 
FDR under a "positive regression dependence" condition on test statistics under 
the null hypotheses. Storey, Taylor and Sigmund [? ] proposed certain modification 
of the BH [l| procedure and proved its FDR controlling properties under conditions 
on the convergence of the empirical processes of the p-values. For related work in 
controlling decision errors in multiple testing, see Cohen and Sackrowitz ,4|, 
Finner and Roters Sarkar 11 1 and Simes 12], in addition to references cited 



elsewhere in this paper. 

In Tang and Zhang [l^ , we showed that the optimal solution of the problem of 



(1.3) 



maximizing ER subject to _E(y/i?)/j^>Q| < a 



may produce undesirable multiple testing procedures, and proposed Bayes and em- 
pirical Bayes (EB) approaches under a conditional version of (jl.Sp given certain 
test statistics which determine R. This means to maximize the total discovery R 
(i.e. power) of multiple testing procedures subject to a preassigned level of the 
conditional FDR: 

(1.4) CFDR(X) EEi;[F/(i?Vl)|X], 

for certain statistics X satisfying £'[i?|X] — R, where (x V y) = ma.x{x,y). These 
approaches, which provide a general framework for controlling the FDR with de- 
pendent data, are discussed in Section 2. 

We note that the concept of conditional FDR (|1.4p allows conditioning on a 
mixture of observations, parameters, and variables generated by statisticians to 
implement randomized rules. The CFDR (|1.4p becomes posterior FDR when X is 
the vector of all observations of concern. If we observe Xq and generate variables Eq 
to execute a randomized multiple testing rule, the constraint ii'[_R|X] = R demands 
X = (Xo,eo) in our optimization problem, cf. (j2.ip below, but the posterior FDR 
is computed given Xg alone for such randomized tests. 

In Section 3, we develop EB methods for controlling the (conditional) FDR in 
a time series model based on our approach. In Section 4, we present simulation 
results which demonstrate the advantages our method compared with the BH [)\ 
rule based on marginal p- values and the additional knowledge of the proportion of 
true hypotheses. 



2. Bayes and empirical Bayes approaches 

Let Hi, ... , Hjn, be the null hypotheses to be tested and 

6i — I{Hi is not true}, 5i = I{Hi is rejected}. 

In a Bayes approach, we treat 9 — (0i, . . . , 9.„i) as a random vector and assume that 
for certain observations X (not necessarily all observations), the joint distribution 
of {9, X} is known given the knowledge of the joint prior distribution of 9 and all 
nuisance parameters (if any). Consider the problem of 

(2.1) maximizing R subject to E[V/{R V 1)|X] < a and £'[i?|X] = R, 
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where R — J2™^i ^^^^ ^ — Sl^Li — Let tt^ — P{9i = 0|X} be the posterior 
probabihty of Hi given X. The Bayes rule, which solves (|2.ip . is given by 

1 

(2.2) %)=/{i<fc*}, fc* EE fc*(a) EE niax|fc < m : -^TTf,) < a|, 

with the convention max0 = 0, where {(1), . . . , (jn)} is the ordering of {1, . . . , m} 
determined by tt^j^) < • • • < '"'(m)- The above Bayes solution is optimal for any type 
of data as long as the joint distribution of (X, 9) is available. 

The Bayes rule (|2.2p provides the most powerful solution for controlling the (con- 
ditional) FDR in the sense of (|2.ip with general dependent test statistics for general 
null hypothesis, provided the knowledge of the joint prior distribution of unknown 
parameters and the computational feasibility for the conditional probabilities of 
the hypotheses Hi given X. The Bayes rule yields R = k* , since it rejects Hi iff 
T^i < T^{k*)- It clearly controls the FDR at level a, since the constraint in p.ip is 
stronger than FDR = E[V/{R V 1)] < a. 

In applications where the full knowledge of the joint prior distribution is not 
available or the computation of the posterior probability of the null hypotheses 
given data is not feasible, the Bayes rule (|2.2|) motivates empirical Bayes rules of 
the form 

1 

(2.3) 6[i] = I{i < fc}, fc = max |fc : — n^i] < a|, 

where tt^ are estimates of the posterior probability tt^ ~ P{iJi|X} and {[!],..., [m]} 
is an estimate of the ordering {(1), . . . , (m)} in (|2.2p . The performance of the Bayes 
rule serves as a benchmark, as the goal of the EB (|2.3p here is to approximately 
achieve optimality in the sense of (|2.ip . This EB approach is applicable for depen- 
dent data as long as suitable estimates tt^ can be obtained. The ordering [i] can be 
derived from tTj if the ordering (?) is not a known functional of the data. 

Tang and Zhang ^iB'l proposed the above Bayes and empirical Bayes approaches 
and proved the asymptotic optimality of the BH [l[ procedure as EB in the sense 
of (|2.ip . In the rest of the section, we discuss implementation of the EB method 
and some related work on Bayes and EB aspects of FDR problems. 

Suppose that the expectation in (j2.ip depends on an unknown parameter ^, so 
that the main constraint in ()2.ip becomes CFDR(X,^) < a. Let 

(2.4) ^T,{X,O^P{^^^0\X,^} 

denote the conditional probabilities of the null hypotheses Hi given statistics X 
and parameters ^. If /(X|^) belongs to a regular parametric family, we may use 
an EB rule with an estimate ^ (e.g. the MLE) of ^ and tt^ = 7r(X,^) in (|2.3p . 
or a hierarchical Bayes rule with a prior on ^ and T^i — J 7''i(X, i^)/(^|X)i^((i^) in 
()2.2p . If {di, Xi), . . . , {6m, Xm) are independent vectors for certain test statistics 
Xi, the conditional probabilities 7ri(X, i^) — Tri{Xi,S^) or the average k~^ ^(j) 
of their ordered values in ()2.2p may still be estimated sufficiently accurately even 
for high-dimensional ^, e.g. the asymptotic optimality BH [1] rule as EB. How do 
we implement p.3p when (j2.4p depends on many components of X and ^ is high- 
dimensional? We propose EB rules (|2.3p with 

(2.5) = 7r„,,(r„,,(X),0, 7r„,,(T„,,(X), = P{0: = 0|r™,,(X), , 
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with certain lower-dimensional statistics Tm^iCK.) which are informative about 9i. 
This can be also viewed as the EB version of the approximate Bayes rule 

(2.6) (5(iy — I{i < fc*}, k* = max |fc < m : — T^m,{i)' 

i=l 

where iTm,{i)' the ordered values of 7rm,i(T'm,i(X), ^). In applications with time 
series, image, or networks data, Tm^iC^) are typically composed of observations 
"near" the location of the i-ih hypotheses Hi. The idea is to reduce the dimen- 
sionality of the function tt^ to be estimated. In the time series example in Section 
3 and certain models of Markov random fields, T^m,,i{Tm,i{'^),£,) depend on ^ only 
through a lower-dimensional functional of ^, so that the complexity of the esti- 
mation problem is further reduced. The cost of such dimension reduction is the 
bias 

We note that Ebm,i = and Var(6,„.i) decreases when we add more variables to 
the vector Tm,i(X). 

Efron et al. [6[ proposed a different EB approach based on the conditional prob- 
ability fdr(a;) = P{di = 0\Xi = x} given a univariate statistic Xi, called local 
fdr, and developed multiple-testing methodologies based on certain estimate of it 
in mixture models. Efron and Tibshirani Q and Efron ^ further developed this 
EB approach using an integrated version of local fdr. These notions of FDR and 
related quantities have been studied in Storey and Genovese and Wasserman 
[lot . Sarkar and Zhou (personal communication) recently considered the posterior 
FDR (i.e. the conditional FDR given all observed data) for a number of multiple 
testing procedures, including a randomized version of (12. 2p . 

3. EB methods in a time series model 

In this section, we develop EB methods for controlling the (conditional) FDR in 
the time series model 

(3.1) Xi = Hi + a, 

where {ei} is a stationary Gaussian process (e.g. moving average) with 

(3.2) Ee^^O, Cov(e„e,+fc) =7(fc). 

Our problem is to test Hi : fii = versus Ki : fii ^ 0, i — 1, . . . ,to, based on 
observations X = {Xi, . . . , X„i)- We assume that the null distribution of Xi ~ 
iV(0,7(0)) is known. We set 7(0) = 1 without loss of generality. 

We derive an EB procedure ()2.3|) of the form (I2.5|l under a nominal mixture 
model in which {0i, /i^) are iid vectors with 

(3.3) fi, I {9, = 0) = 0, I iO^ = 1) ~ ^(^7, r^) 

for certain unknown {rj^T^). We assume X)fc')'^(^) ^ which allows us to take 
advantage of the diminishing correlation. This leads to 

(3.4) T„,,(X) = (X„|j-*| <fc), e= ('7,T2,u;o,7(l),---,7(fc)), 
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in ((231) . where wq = P{6'i = 0}. 

As we have mentioned in Section 2, conditioning on the lower-dimensional statis- 
tics T„j_i(X) is helpful in two important ways: computational feasibility and reduc- 
tion of the set of nuisance parameters involved. Since the components of X are all 
correlated, the posterior probability 7ri(X, ^) in (|2.4p and thus the Bayes rule k* in 
(|2.2p demand the inversion of m-dimensional conditional covariance matrices and 
summation over 2™ possible values of 9. Thus, the Bayes rule is computationally 
intractable. Exact computation of 7r,„_i(T„j_i(X), ^) in (|2.5p is much more manage- 
able since it involves the inversion of (2fc -I- l)-dimensional covariance matrices and 
summation over 2^'^"'"^ possible values of T,n,i{d) = {Oj, \j — i\ < k) . Also, the con- 
ditional probability 7r„i_i(r„i_i(X), i^) of Hi given Tm.iOQ does not depend on 7(j) 
for j > k, so that the dimensionality of ^ is much smaller than the sample size if 
we choose k — o{m). 

Given wq, we estimate ^ as follows: 



(3.5) 5^= 1 Vx./(l-«;o), 

m — ' 

1=1 



1 - Wo ^ (1 - p)2m2/2 

= 1 ' l<i<i<m,i-i>pm ^ ' ' 



(3.7) 7(j) 



l<'i<j'<m,j— i>pm 

\ ^ XiXij^j \ ^ XiXj 



^ 771—7 ^ {\ — pYm? 12^ 

1=1 ■' l<i<j<m,j-i>pm ^ ' 



where < p < 1. Estimates (|3.5p . (|3.6p and (13. 7p are based on the method of 
moments, since ((3?T|) . (|3^ and (|3?3| imply £'X, = E^ii = (1 - wo)r], EXf = 
l + E^ii - 1 + (1-77;o)(t2+7?2), EX,X,+j^-i(3) + {Eiiif, and EX,X, « (£;pi)2 
for large \ j — i\. In order to reduce the bias [composed of terms involving 7(j — ?)], 
we use the average of XiXj with \i — i\ > pm in p.6p and (|3.7p to estimate 
(i?Xi)^ = {EfXi)^^ instead of (X^I^i ^i/™)^- the simulation study discussed in 
Section 4, we take p = 0.1. 

For the estimation of the proportion wn of null hypotheses, we use a Fourier 
method (Tang and Zhang [l^ and Zhang [3]) and its parametric bootstrap version. 
The Fourier method estimates wq by 

(3.8) wf^ = - y V'(X,; /7™), 7^(z; h) = / h^o{ht)e'' cos{zt)dt, 

777 — ' / 

i=l 

where V'o is a density function with support [—1, 1] and hm — {K(log 777)}~^/^ is the 
bandwidth, k < 1. This estimator is derived from 



E 



ipo{t) cos{{fii/h)t)dt 



1, fM 0, ft, > 0, 

0, m / 0, ft -> 0- 



by Ricmann-Lebesgue. For the bootstrap version of ()3.8p . we generate bootstrap 
samples of X under the parameter value of ^ in (|3.5p - (|3.8p and then estimate wa 

by 
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where Wg is the average of the estnuator (|3.8[) based on the bootstrap samples. In 
the simulation study, we use uniform [—1, 1] as V'o and k = 1/2 for (|3.8p . and we 
bootstrap 100 times for ()3.9|) . 

The Empirical Bayes procedure rejects the null hypotheses associated with the 
first k smallest estimated conditional probabilities 7rm,i(T'rri,z(X), ^ ), with 

(3.10) fc = max |fc : — 7f[j] < a|, 

4=1 

where ttji] < ••• < T^[m\ are ordered values of 7r„i,i(T,„_i(X), ^ ), and ^ is defined 
through (|3.5p - (|3.8p with the alternative of using the bootstrap estimate (|3.9p for 
wq. The conditional probability 

^m,4(T™,,(X),e) =P{e, = 0|T„,,(X),e}, 

with ^ replaced by ^, is computed by conditioning on Tm,i{0) — {0j, \ j — i\ < k). 
To save notation, we may drop the subscript m in the rest of the paragraph, e.g. 
Tm,i = Ti. Under the nominal mixture model p.3p . the conditional joint distribution 
of T'i(X) is multivariate normal 



T,{x)\T,{e) ^ N(f]T,{e),j:,{e)y 



where the covariance matrix Yii{9) — Cov(Ti(e)) + T^diag(T!i(6')) depends on un- 
known parameters Ti{9), {^{j) ■ I < j < k} and r^. The joint density of this 
conditional distribution is 



f^{^^\T^{9) 



exp [ - (v, - r;T,(0)){I],(g)}-i(v, - vW))' /2] 
(2^)'*'/2{det (S,(0))}i/2 



where di — #{j : |j — i| < k} is the dimensionality of T!i(X), ranging from 1 + k 
to 2fc + 1 depending on if i is close to the endpoints i — I and i — m, and are 
di-dimensional row vectors. This gives 

f'^ll) 7r,„,i(T,„,i(X),^) 

_ ET,(g)gn^°'/^(^^(X)|T.(^))z^g'-^'^^^(l-u;o)-'W 

~ ET.Wen./.(7^»(X)|r.(0))«;„^*-^-('')(l-^«o)-W ' 

where s,(0) = T.\j-A<k ^» = {0' ^V'^ and flf^ = {T,(0) e n,: 0^ ^ 0}. 

The estimation of the proportion of null hypotheses is an important aspect of the 
FDR problem. Benjamini and Hochberg [1] simply used the conservative = 1 to 
control the FDR at level a, while Benjamini and Hochberg la] suggested the possi- 
bility of power enhancement with estimated wq in the BH lll| procedure. Different 
estimators of wq were proposed by Storey [13] and Storey, Taylor and Siegmund 
[3] based on the tail proportion of p-values, and by Efron et al. based on 
minj:{/(x)//o(a;)}, in the context of controlling the FDR. Tang and Zhang proved 
the consistency of (j3.8p for normal mixtures. 



4. Simulation results 



In this section, we describe the results of our simulation study. We compare five 
procedures: the BH 1] rule using the true Wq, the approximate Bayes rule (|2.6p . and 
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three EB rules ([3l0)) using r,„,,(X) = {X^, \j -i\<2), estimators ([33]), jSH) and 
(|3.7p . and the following three values for (estimated) wq: the true wq, the Fourier 
estimator p.8|) . and the bootstrap estimator ()3.9|) . We denote the BH procedure by 
BH-wq, and the three EB rules by EB-wo, EB-Fouricr and EB-bootstrap respec- 
tively. The target (conditional) FDR level is a = 0.1 throughout the simulation 
study. 

Simulation data are generated as follows: m = 1000, 

#{i : M» = 0} = 900, #{i : /i. = 2} = 100, 

i.e. Wo = 0.9, and 7 = (1,0.6,0.4,0.2,0.1,0,0,...), e.g. 7(1) = 0.6. We note that 
this is a singular point in the nominal mixture model IjS.Sp for the derivation of EB 
procedures. A realization of the simulated X is plotted in Figure [T] 

We plot the simulated pairs of {V/R, R), i.e. the proportion of false rejections in 
the X-axis and the total number of rejections in the y-axis, for the BH-wq and the 
three EB procedures in Figure [U with dashed lines at the means of the simulated 
data. The mean and standard deviation of V/R, R, and V for all five procedures 
are given in Table 1. It is clear that the EB procedures are much more powerful 
than BH as they have much higher number R of rejections, while the false discovery 
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Fig 2. The proportion of false rejections (x-axis) and the total number of rejections (y-axis) for 
BH-wo, EB-wo, EB-Fourier, and EB-bootstrap, clockwise from top-left; solid vertical lines for the 
target FDR of a = 10% and dashed lines for the means of simulated points in the plots. 



ratio V/ R are similar among EB and BH procedures. 



5. Conclusion 

For multiple testing problems, we describe the Bayes and empirical Bayes ap- 
proaches for controlling the (conditional) FDR proposed in Tang and Zhang [15j . 
While these approaches are completely general for dependent data, its implemen- 
tation is subject to computational feasibility and the availability of sufficient infor- 
mation for the estimation of certain conditional probabilities of the individual null 
hypotheses. We propose in this paper to use the conditional probabilities of the 
null hypotheses given certain low-dimensional statistics to ease the computational 
burden and possibly to reduce the number of nuisance parameters involved. We im- 
plement this EB approach in a time series model with general stationary Gaussian 
errors. Simulation results demonstrate that the EB procedures have much high 
number of correct rejections and similar false rejection ratio compared with an 
application of the procedure of Benjamini and Hochberg [T] based on individual p- 
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Table 1 

Mean (p.) and Standard Deviation (a) ofV/R, R, and V 





V/R 


R 


V 




Procedure 


mean 


SD 


mean 


SD 


mean 


SD 


BH-w)o 


0.11 


0.10 


13.52 


7.57 


1.66 


1.82 


Approximate Bayes 


0.12 


0.03 


76.86 


6.62 


9.12 


3.16 


EB-«)o 


0.10 


0.09 


34.42 


7.95 


3.27 


3.05 


EB-Foureir 


0.13 


0.11 


34.49 


13.44 


5.06 


4.95 


EB-bootstrap 


0.13 


0.12 


35.59 


15.05 


5.89 


6.56 



values. This clearly demonstrates the feasibility and superior power of the proposed 
EB approach for controlling the false rejection with dependent data. 
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